Introduction

City Pop vs U.S. Pop

In this portfolio, I explore the differences between, and the features of, two genres: City Pop (CP) and U.S. Pop (UP), originating from Japan and the U.S. respectively. Genre is a loose term here: some sources describe CP as a vibe that runs through many different genres, while others describe it as pop music for city people. For simplicity’s sake, however, I will use the term genre. CP appeared in Japan in the late 1970s and reached its peak popularity in the 1980s.

I compare two playlists of three artists each, one Japanese and one American. The Japanese playlist consists of Taeko Onuki, Miki Matsubara and Anri; the U.S. playlist consists of Michael Jackson, Whitney Houston and Madonna. I chose these corpora because I want to find out whether there are distinct differences between (city) pop as it was in Japan in the 1980s and the pop that was popular in the Western world in the same decade. Japanese CP was influenced by Western music, so I expect many similarities in sound, instrumentation and rhythm. I am particularly interested in whether the two differ in the prevalence of bass and in tempo, and in other aspects such as timbre. To what extent they differ is what this portfolio sets out to explore.

The reason I chose this topic is my personal taste. Ever since I was a child, pop has been a significant part of my life, as it was perhaps the main genre both my parents listened to. Thanks to its rising popularity on the internet, CP has gained a lot of traction and has even spawned new sub-genres such as future funk. I think this comes from its similarity to Western pop and from the way many of the songs still sound modern by today’s standards. The previously mentioned idea of pop for city people is what attracted me, and I have been listening to CP since I discovered it 3-4 years ago.

Because I have chosen only three artists to represent each genre, there may be nuances and varieties I am missing. Taeko Onuki, Miki Matsubara and Anri were chosen for their popularity on Spotify (their number of listeners as well as the number of plays of their tracks). I should also mention that the track selection was partly personal, although it was centered on three albums from each artist. The same method was used to choose the Western counterparts. Both genres are very broad, so some varieties may have been overlooked. On the other hand, the artists’ popularity is a strength, as many casual listeners will know these songs.

Typical, and popular, tracks from the Japanese playlist are:

These songs are typical in the sense that they make prominent use of basslines and clear rhythms, sound very stereotypically pop, and are rich in timbre, with many layers of sound from different instruments.

The western counterparts have typical tracks like:

These last three tracks especially have the typical and distinct features of 80’s pop, namely the sharp drums, the heavily synthesized piano sounds and what I think of as an almost “dreamy” sound.

Atypical songs from both playlists can include:

To explore possible differences between the two playlists, I will first start with a classification model, a random forest. Then I will explore track-level features in the two genres, focusing on the features that the classification model singles out. After that, I will go into musical aspects such as timbre and chroma, using self-similarity matrices, chromagrams and chordograms. Finally, I will summarize what this portfolio has explored and conclude what can be derived from it.

Playlist CP

Playlist UP

Classification

The classifier is better than expected at categorizing each genre


# A tibble: 2 × 3
  class    precision recall
  <fct>        <dbl>  <dbl>
1 City Pop     0.719  0.742
2 U.S. Pop     0.733  0.710

To compute the model, I capped the playlists at 31 songs per group. The knn model correctly predicts just over 20 songs in each genre, so I assume it is decent at telling City Pop and U.S. Pop apart. Precision and recall for both genres stay around 0.7-0.8, and accuracy stays around the same values. Next, I will look at which features were most important for classifying these playlists.

Of these features, the track-level feature loudness and the timbre coefficients are the most important when classifying CP and UP.


# A tibble: 2 × 3
  class    precision recall
  <fct>        <dbl>  <dbl>
1 City Pop     0.75   0.774
2 U.S. Pop     0.767  0.742
          Truth
Prediction City Pop U.S. Pop
  City Pop       24        8
  U.S. Pop        7       23

In the random forest model, accuracy and precision also stay around the same values as in the knn model, indicating that both models are decent at classifying the genres.
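The precision and recall values in the table above can be recomputed by hand from the confusion matrix. The following is a minimal Python sketch of that arithmetic, not the R code that produced the output; the function names are my own, and the only data used are the four counts printed in the confusion matrix.

```python
# Confusion matrix copied from the random forest output above:
# outer keys are predictions, inner keys are the true classes.
confusion = {
    "City Pop": {"City Pop": 24, "U.S. Pop": 8},   # tracks predicted as City Pop
    "U.S. Pop": {"City Pop": 7, "U.S. Pop": 23},   # tracks predicted as U.S. Pop
}


def precision(matrix, cls):
    """Share of tracks predicted as `cls` that truly belong to `cls`."""
    predicted = sum(matrix[cls].values())
    return matrix[cls][cls] / predicted


def recall(matrix, cls):
    """Share of tracks truly in `cls` that were predicted as `cls`."""
    actual = sum(row[cls] for row in matrix.values())
    return matrix[cls][cls] / actual


def accuracy(matrix):
    """Share of all tracks that were assigned to their true class."""
    correct = sum(matrix[cls][cls] for cls in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total


for cls in confusion:
    print(cls, round(precision(confusion, cls), 3), round(recall(confusion, cls), 3))
print("accuracy", round(accuracy(confusion), 3))
```

Run on these counts, the sketch reproduces the precision and recall columns of the table (0.75 and 0.774 for City Pop, 0.767 and 0.742 for U.S. Pop) and gives an overall accuracy of about 0.758, i.e. 47 of the 62 tracks.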

Regarding feature selection, three features trade places in the top three of the feature plot from run to run: loudness, c1 and c11. The importance of both loudness and the timbre coefficient c1 is quite amusing, as one of the lectures described this timbre coefficient as roughly equivalent to loudness. After those, it is primarily timbre features that matter most. This is quite interesting, as I said at the very start of the course (without knowing any of the musical terms) that City Pop songs have a lot of layers to them. By this I meant that “there is a lot going on” and that it sounds different.

This is probably why, according to the sources mentioned in the introduction, CP is mainly defined by vibes and urban feelings rather than by features that clearly set it apart in other musical aspects.

Graphs

CP is louder than UP


This plot shows tempo and loudness for the two genres. It was made as a direct consequence of the feature importance found in the previous section. There is a distinct difference in loudness: the City Pop group clearly tends towards higher loudness than the U.S. Pop group. Both genres tend to stay around the same tempo, roughly 120-125 beats per minute, which corresponds well to the study by Moelants (2002), which found that humans seem to prefer this tempo.

However, many tracks do not fit this pattern: there are points between 90 and 120 bpm in CP and between 90 and 150 bpm in UP. One could perhaps say that UP has more tracks sitting right at 120 bpm.

In terms of loudness, UP stops at just above -9 dB, whereas CP clearly does not: multiple tracks are above -6 dB. If I am interpreting this correctly, it means that CP is louder than UP.
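To put the gap between roughly -9 dB and -6 dB into perspective, the decibel scale can be converted back into ratios. This is a minimal Python sketch that assumes the loudness values can be read as ordinary decibel levels; the two values are the approximate ones read off the plot above, and the function names are my own.

```python
def power_ratio(db_difference):
    """Ratio of signal power implied by a level difference in decibels."""
    return 10 ** (db_difference / 10)


def amplitude_ratio(db_difference):
    """Ratio of signal amplitude implied by a level difference in decibels."""
    return 10 ** (db_difference / 20)


# Approximate values read off the plot: loudest UP tracks vs. loud CP tracks.
gap_db = -6 - (-9)  # a 3 dB difference

print(round(power_ratio(gap_db), 2))      # about 2.0
print(round(amplitude_ratio(gap_db), 2))  # about 1.41
```

Under this assumption, a 3 dB gap corresponds to roughly twice the signal power. Perceived loudness does not scale in the same way, so this should be read as a rough indication rather than a claim that the loudest CP tracks sound twice as loud.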

Other track level features single out some outliers.


This plot shows the effect of energy on danceability, with both the size of the points and the band around the line indicating the tempo of the songs. For U.S. Pop there is more of a linear trend, suggesting a correlation between energy and danceability up to an energy of around 0.5. Tempo, however, shows no pattern at first glance. For the City Pop playlist, there is a slight curve at the beginning of the graph, but overall the trend is very even. Here, too, there is no sign of a trend for tempo at first glance.

Still, one can tell that U.S. Pop shows more of a positive linear correlation between energy and danceability than its eastern counterpart.

Across the two groups, two outliers especially draw the eye: “Billie Jean” by Michael Jackson and “横顔” by Taeko Onuki. Compared to the other songs, both have very low energy while at the same time having high danceability.

Chromagrams of Typicals

Chromagrams of 4:00A.M. and I Wanna Dance..

Similar in terms of chroma

“4:00A.M.” and “I Wanna Dance with Somebody (Who Loves Me)” were both mentioned as popular and typical songs in my corpus, and their chromagrams are shown to the left. The same chromas are used in a very similar manner, with exceptions in some areas, such as F#/Gb and the point at which the C chroma is used most. For Taeko Onuki’s song, this is at the beginning, whereas for Whitney Houston’s it is mostly at the end. The F chroma is also used more. Overall, both songs seem very homogeneous in terms of pitch classes, with little variation either between or within the songs. This is quite representative, as the songs have a very stereotypical structure of verses and choruses that repeat. In this instance, then, the chromagrams do not yield much insight into differences in features. They do, however, tell us that the two genres can be quite similar.

On the other hand, it should be mentioned that Taeko Onuki’s song dedicates a good minute and a half to a rather simple guitar solo. This section does not stand out in the chromagram, which looks homogeneous throughout. So perhaps chromagrams do not work very well for these items in my corpus.

Keys & Chords

Chordogram of CARNAVAL shows many 7th chords


Here is a chordogram of the song “CARNAVAL” by Taeko Onuki. The song has a clear synthesizer sound, so the chordogram should be able to represent it accurately. A chordogram shows how closely each bar of the song matches each chord: the darker the band, the more likely it is that this chord was played at that point in the song. However, many of the bands are dark, and I am not entirely sure this is representative. To give it the benefit of the doubt, there are many parts of the song in which the synthesizer has solos.

Many of the bands are dark around the major 7th chords, but the minor 7th chords are dark as well, especially towards the end of the song. This could be in line with sources indicating that CP employs these types of chords.
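The idea behind a chordogram can be illustrated with simple template matching: each chord is written as a 12-element pitch-class template, and each chroma vector is compared against every template. The following is a minimal Python sketch under that assumption. The Manhattan distance, the function names and the idealised chroma vector are my own illustrative choices and are not necessarily what produced the plot above.

```python
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Intervals above the root, in semitones.
CHORD_SHAPES = {
    "maj7": (0, 4, 7, 11),  # root, major third, fifth, major seventh
    "min7": (0, 3, 7, 10),  # root, minor third, fifth, minor seventh
}


def chord_template(root, shape):
    """Binary 12-element pitch-class template for a chord on the given root."""
    template = [0.0] * 12
    for interval in CHORD_SHAPES[shape]:
        template[(root + interval) % 12] = 1.0
    return template


def manhattan(a, b):
    """Sum of absolute differences; smaller means a closer match."""
    return sum(abs(x - y) for x, y in zip(a, b))


def best_chords(chroma, top=3):
    """Return the `top` closest chord templates as (distance, name) pairs."""
    scores = []
    for root, name in enumerate(PITCH_CLASSES):
        for shape in CHORD_SHAPES:
            distance = manhattan(chroma, chord_template(root, shape))
            scores.append((distance, name + shape))
    return sorted(scores)[:top]


# Hypothetical, idealised chroma vector: equal energy on C, E, G and B only.
toy_chroma = chord_template(0, "maj7")
print(best_chords(toy_chroma))
```

In this toy case the closest template is Cmaj7, but Amin7 and Emin7 follow closely because each shares three of its four notes with Cmaj7. This overlap is one possible reason why major and minor 7th bands can appear dark at the same time.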

In the next slide, a chordogram of the song “You’re Still My Man” by Whitney Houston will be presented and compared.

Not many differences between CP and UP as far as I can tell


The song is a very calm love song, also equipped with a synthesizer, like the previous song. This should be a good way to compare the two genres. Here the bands are very dark around the same chords as the CP song. Both the major and minor 7th bands are dark throughout the song. “You’re Still My Man” also has two modulations. This is represented well in the chordogram.

However, at least between these two songs, there are no clear differences between CP and UP.

Self-Similarity Matrices

Self-Similarity Matrices of Material Girl and Mayonaka no Door/Stay With Me

CP uses sound color in more novel ways than UP

Here, self-similarity matrices based on both chroma and timbre are compared for one song from each playlist: “Mayonaka no door / Stay With Me” by Miki Matsubara and “Material Girl” by Madonna. These were chosen because they are quite typical songs in my corpus. As one can see, there are distinct differences between the two songs in both timbre and chroma.

From the almost black diagonal patterns in the chroma SSM, one can see that the CP song has more repetitive patterns in pitch classes than the UP one. Novel material is introduced, as shown by the bright colors creating a checkerboard pattern, but this is more prominent in “Material Girl”. According to its chroma SSM, the latter shows less repetition in pitch classes and more novelty. This is especially apparent at around 200 seconds into the song, where a brightly colored band shows up, indicating entirely novel material. This suggests that CP is fond of repetitive pitch classes, perhaps even more so than UP.

However, the timbre-based SSMs tell an entirely different story. In the CP song, timbre shows novelty at many different points. The UP song is a strong contrast: it has a lot of repetition in timbre and far less novelty than the CP song. This supports the classifier’s suggestion that timbre might be what really distinguishes CP from UP.
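How a self-similarity matrix arises can be shown on a very small scale: every moment of a song is described by a feature vector (chroma or timbre), and each cell of the matrix holds the distance between two moments. The following is a minimal Python sketch of that idea. The cosine distance, the function names and the four toy frames are hypothetical choices of mine and are not taken from the songs or from the code behind the plots.

```python
import math


def cosine_distance(a, b):
    """0 for vectors pointing the same way, larger for dissimilar vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms


def self_similarity(frames):
    """Pairwise distance between every pair of frames in one song."""
    return [[cosine_distance(a, b) for b in frames] for a in frames]


# Hypothetical toy "song" with the structure A A B A:
# three identical frames and one contrasting frame in third position.
toy_frames = [
    [1.0, 0.0, 0.2],
    [1.0, 0.0, 0.2],
    [0.0, 1.0, 0.5],
    [1.0, 0.0, 0.2],
]

ssm = self_similarity(toy_frames)
for row in ssm:
    print([round(value, 2) for value in row])
```

In the resulting matrix, the repeated A frames have a distance close to zero to one another, which would be drawn as dark cells, while the row and column of the B frame hold large distances, which would be drawn as a bright band. This mirrors how repetition and novelty are read from the plots above.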

Conclusions/Summary

Differences in City Pop vs US Pop

Throughout this portfolio, I have tried to explore the genres City Pop and U.S. Pop to the best of my ability, with the aim of finding and substantiating differences between the two.

The classifier did a better job than expected at labeling the two genres, and it revealed that loudness and the timbre coefficients were the most effective features for differentiating the corpus. Using these, I was able to visualize that CP is louder than UP. Furthermore, the self-similarity matrices supported the idea that CP differs from UP in timbre. However, the tempo plot, the chromagrams and the chordograms show that there are not many differences in tempo or pitch.

As a more general conclusion, one can say that there is substance to the arguments mentioned in the introduction, namely that the “urban” vibe that so many talk about is a major part of what makes CP, well, CP. Otherwise, as far as I can tell, the two genres are pretty similar.

However, the major limitations of this “study” should be reiterated. Due to factors such as time and processing power, only a couple of songs at a time could be compared, which means that some varieties may be unrepresented and some comparisons may be poorly matched. Still, given these limitations, I think I did a pretty good job.